Concepedia

Concept

speech acoustics

Publications: 9.5K

Citations: 562.3K

Authors: 17.2K

Institutions: 3.2K

Integrated Prosodic Coding

1959 - 1965

During 1959–1965, research converged on treating prosody as an integrated, multi-channel code in which fundamental frequency (F0), amplitude, and duration jointly signal intonation and stress across languages. Onset timing and voicing contrasts emerged as central cues for differentiating speech events and articulatory actions, guiding experimental designs and early perceptual judgments. Measurement innovations such as cinefluorography and electromyography began to connect articulation with acoustic output, while binaural psychoacoustics offered a practical framework for understanding perceptual masking and separability. Physiological state, including aging and laryngeal factors, was shown to modulate measures of pitch and voice quality.

Historical Significance

These developments established core prosodic cues and cross-language voicing patterns that underpinned later theories of perception, speech processing, and prosodic rhythm. Methods and measurement tools that link articulation to acoustics produced enduring datasets and methodologies that influenced multimodal phonetics and intelligibility research. Foundational studies on vowel duration, word stress cues, and cross-language voicing provided benchmarks for future research in speech synthesis, language processing, and hearing sciences.

Prosody and stress patterns emerge as a multi-channel acoustic code: F0, amplitude, and duration jointly signal intonation and stress in both English and cross-linguistic contexts, suggesting integrated perceptual cues [1][2][19][14].

Temporal dynamics and onset timing act as central cues for distinguishing speech events and articulatory actions, as shown by onset-time discrimination, voicing contrasts in initial stops, and intervocalic timing patterns across contexts [3][6][7].

Measurement technologies bridge articulation and acoustics: cinefluorography visualizes tongue and mouth movements, electromyography records muscle activity, and studies of velopharyngeal closure link articulatory gestures to vowel quality [5][8][12].

Binaural psychoacoustics and masking-level differences become a guiding framework, combining equalization/cancellation models with empirical data to explain perceptual separability and masking [18][20].

Phonation, laryngeal correlates, and aging shape acoustic measures of pitch, periodicity, and voice quality, illustrating how physiological state modulates acoustic patterns and perceptual judgments [10][15][11].
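As a rough illustration of the three prosodic cues named above (F0, amplitude, and duration), the sketch below measures each one on two synthetic tones. It is not taken from the cited studies: the function names, the 8 kHz sample rate, the 75 to 400 Hz search range, and the two "syllables" (pure sine tones) are all assumptions made for the example, and the F0 estimator is a simple squared-difference periodicity search rather than a production pitch tracker.

```python
import math

def synth_tone(f0_hz, amp, dur_s, sr):
    """Hypothetical stand-in for a voiced segment: a pure sine tone."""
    n = int(round(dur_s * sr))
    return [amp * math.sin(2 * math.pi * f0_hz * i / sr) for i in range(n)]

def estimate_f0(samples, sr, fmin=75.0, fmax=400.0):
    """Estimate F0 as sr / lag, where lag is the shortest shift at which
    the signal best matches a delayed copy of itself."""
    min_lag = int(sr / fmax)
    max_lag = int(sr / fmin)
    window = len(samples) - max_lag  # same comparison window for every lag
    if window <= 0:
        raise ValueError("segment too short for the requested F0 range")
    diffs = []
    for lag in range(min_lag, max_lag + 1):
        d = sum((samples[i] - samples[i + lag]) ** 2 for i in range(window))
        diffs.append((lag, d))
    energy = sum(s * s for s in samples[:window])
    best = min(d for _, d in diffs)
    tol = 1e-6 * energy  # assumed tolerance; prefers the shortest matching lag
    for lag, d in diffs:
        if d <= best + tol:
            return sr / lag

def rms_amplitude(samples):
    """Root-mean-square amplitude of the segment."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def duration_s(samples, sr):
    """Segment duration in seconds."""
    return len(samples) / sr

def prosodic_cues(samples, sr):
    """Bundle the three cues so they can be compared jointly."""
    return {
        "f0_hz": estimate_f0(samples, sr),
        "rms": rms_amplitude(samples),
        "duration_s": duration_s(samples, sr),
    }

SR = 8000  # assumed sample rate in Hz

# Illustrative values only: the "stressed" tone is higher, louder, and longer.
unstressed = synth_tone(100.0, 0.3, 0.05, SR)
stressed = synth_tone(200.0, 0.5, 0.10, SR)

unstressed_cues = prosodic_cues(unstressed, SR)
stressed_cues = prosodic_cues(stressed, SR)

# A cue "wins" when the stressed segment exceeds the unstressed one.
cue_agreement = {k: stressed_cues[k] > unstressed_cues[k] for k in stressed_cues}
```

Here `cue_agreement` records, per cue, whether the second tone exceeds the first. Real speech would need framing, voicing detection, and a more robust pitch tracker, but the sketch shows how the three channels can be measured side by side before being weighed together.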

Parametric Speech Modeling

1966 - 1972

Dynamic Formant-Transition Perception

1973 - 1979

Gestural Articulatory Phonology

1980 - 1987

Contextual Acoustic Modeling

1988 - 1994

Spectrotemporal Phonetics and Statistical Learning

1995 - 2001

Contextual Hidden Markov Models

2002 - 2008

Adaptive Speech Perception

2009 - 2016

Neural Speech Systems

2017 - 2023